Markov chain methods for small-set expansion 
Abstract



Consider a finite irreducible Markov chain with invariant distribution $\pi$. We use the inner
product induced by $\pi$ and the associated heat operator to simplify and generalize some results
related to graph partitioning and the small-set expansion problem. For example, Steurer showed
a tight connection between the number of small eigenvalues of a graph's Laplacian and the
expansion of small sets in that graph. We give a simplified proof which generalizes to the
nonregular, directed case. This result implies an approximation algorithm for an "analytic"
version of the Small-Set Expansion Problem, which, in turn, immediately gives an approximation
algorithm for Small-Set Expansion. We also give a simpler proof of a lower bound on the
probability that a random walk stays within a set; this result was used in some recent works on
finding small sparse cuts.

1 Overview

Graph partitioning using spectral methods has recently been the subject of intensive study. Many 
results in this area have been proven using discrete-time random walks. However, these techniques 
work best when applied to regular graphs with nonnegative eigenvalues. As a result, it has become 
standard to move to a lazy version of a graph by adding self-loops, i.e. using $(I + K)/2$ instead
of $K$ as the adjacency matrix. Much work has also focused on regular graphs only or considered
the normalized Laplacian $D^{-1/2} L D^{-1/2}$.

In this work we show that these problems can be avoided using Markov chain techniques, leading 
to simpler and more general proofs of results related to spectral graph partitioning. Rather than 
using discrete-time random walks, we consider continuous-time random walks and the associated 
heat operator. "Smoothing out" the random walk makes the eigenvalues nonnegative, avoiding 
the need to move to lazy graphs and allowing our techniques to be directly applied to the original 
instance. In addition, we use the inner product defined with respect to the invariant distribution $\pi$ of
the Markov chain representing a random walk on the graph. We are then able to use our methods 
directly on nonregular graphs. 

We will now give a brief description of some previous results in spectral graph partitioning. 
Let G = (V, E) be a graph on n vertices. Let K be its (normalized) adjacency matrix, let L 
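be its (normalized) Laplacian matrix (namely $I - K$), and let $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le 2$ be
the eigenvalues of $L$. The conductance $\Phi[S]$ of a set $S \subseteq V$ is defined to be the fraction of the
edge weight incident to $S$ that leaves $S$, i.e. $E(S, \overline{S})/\mathrm{vol}(S)$. The
conductance profile of $G$, denoted $\Phi_G$, was defined by Lovász and Kannan [LK99] as

$$\Phi_G(r) = \min\{\Phi[S] : S \subseteq V,\ \mu[S] \le r\}.$$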

Cheeger's inequality for graphs [AM85, Alo86, SJ89] states that $V$ can be partitioned into
nonempty $S_1, S_2$ such that $\Phi[S_i] \le O(\sqrt{\lambda_2})$. Very recently, Louis, Raghavendra, Tetali, and
Vempala [LRTV12] and Lee, Oveis Gharan, and Trevisan [LOT12] have given a "higher order
Cheeger inequality" involving higher eigenvalues. Specifically, the two results show that for any $k$,
one can partition $V$ into $\Omega(k)$ disjoint nonempty sets $S_i$, each of which has conductance $\Phi[S_i] \le
O(\sqrt{\lambda_k \log k})$. Since one of these parts has volume $\mu[S_i] := |S_i|/|V| \le O(1/k)$ we may conclude
that

$$\Phi_G(\mathrm{const}/k) \le O(\sqrt{\lambda_k \log k}). \tag{1}$$

As noted in these works, for a fixed $k$ the "extra factor" of $O(\sqrt{\log k})$ in (1) is necessary; indeed
this is true [LOT12] for all $k \le \log_2 n$. However, somewhat intriguingly, the extra factor becomes
unnecessary once $k$ is as large as $n^{\Omega(1)}$, at least if one is willing to compromise somewhat on the
volume parameter. Specifically, Arora, Barak, and Steurer [ABS10] showed for regular graphs that

$$\Phi_G(O(k^{-1/100})) \le O(\sqrt{\lambda_k \log_k n}). \tag{2}$$

In his thesis, Steurer [Ste10] improved this bound to

$$\Phi_G(k^{-1+1/A}) \le O(\sqrt{A \lambda_k \log_k n}) \quad \text{for any (sufficiently large) constant } A. \tag{3}$$

Using Markov chain methods, we give what we feel is a much simpler proof of this result, which 
also works for the nonregular (and also directed) case. Our result also implies an approximation 
algorithm for an "analytic" version of the Small-Set Expansion problem. This, in turn, immediately 
gives an approximation algorithm for Small-Set Expansion by a standard version of Cheeger's 
Inequality.

In somewhat related recent work, Oveis Gharan and Trevisan [OT12] proved a weaker version of
this bound with $k^{-1/3}$ in place of $k^{-1+1/A}$. The main point of that work, along with the independent
work of Kwok and Lau [KL12], is to give a polynomial-time algorithm for the Small-Set Expansion
problem in an unweighted (nonregular) graph $G = (V, E)$ with the following guarantee: if there
exists $S \subseteq V$ with $\mu[S] \le \delta$ and $\Phi[S] \le \epsilon$, the algorithm finds $T \subseteq V$ with $\mu[T] \le O(\delta) \cdot (\delta|E|)^{\alpha}$
and $\Phi[T] \le O(\sqrt{\epsilon/\alpha})$ (for any small $\alpha > 0$). To achieve this, both papers prove a theorem stating
that for any $S \subseteq V$ and integer $t > 0$, the probability that a $t$-step random walk starting from a
random $x \in S$ stays entirely within $S$ is at least $\left(1 - \frac{\Phi[S]}{2}\right)^t$. We also give a simpler proof of this
result for continuous-time random walks.

1.1 Our results 

1.1.1 Bounding the spectral profile 

In this work we provide a different, simple proof of Steurer's improved result using continuous-time 
random walks instead of lazy discrete-time random walks: 

Theorem 1.1. In any strongly connected graph $G$, $\Phi_G(16k^{-1+1/A}) \le 2\sqrt{A} \cdot \sqrt{\lambda_k \log_k n}$ for any
real $A \ge 3$.

For example, $\Phi_G(k^{-.999}) \le O(\sqrt{\lambda_k \log_k n})$ for $k$ sufficiently large. See Section 2 for the appropriate
definitions of $L$, $\lambda_i$, etc. in the context of general graphs $G$.

In fact, our result is stronger than this in that we are able to directly bound the spectral 
profile of G. (The same is true of the result in Arora-Barak-Steurer [ABS10] and in Steurer's 
thesis [Ste10].) Recall that the spectral profile of $G$, introduced by Goel, Montenegro, and
Tetali [GMT06], is defined by

$$\Lambda_G(r) = \min\left\{ \frac{\langle f, Lf\rangle}{\|f\|_2^2} : \text{nonzero } f : V \to \mathbb{R}^{\ge 0} \text{ with } \pi(\mathrm{supp}(f)) \le r \right\}.$$

Goel, Montenegro, and Tetali showed that the "Cheeger rounding analysis" yields the following
relationship with conductance profile: $\Phi_G(r) \le \sqrt{2\Lambda_G(r)}$ for all $r$.$^1$ As in [ABS10] we work with a
slightly different definition of spectral profile, for technical convenience:

$$\Lambda'_G(r) = \min\{\Phi[f] : \mu[f] \le r\}, \quad \text{where} \quad \Phi[f] = \frac{\langle f, Lf\rangle}{\langle f, f\rangle} \quad \text{and} \quad \mu[f] = \frac{\|f\|_1^2}{\|f\|_2^2}$$

are appropriate generalizations of boundary size and volume to functions $f : V \to \mathbb{R}$. (These
definitions agree with our earlier ones when $f$ is the 0-1 indicator of a set $S \subseteq V$.) As noted
in [ABS10, Lemma A.2] we have $\Lambda_G(4r) \le 2\Lambda'_G(r)$ for all $r$. (A similar reverse connection also
holds.) Thus:

Theorem 1.2. (Essentially from [GMT06].) $\Phi_G(4r) \le 2\sqrt{\Lambda'_G(r)}$ for all $r$.
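
For completeness, the two ingredients chain together as follows. This is a short sketch, assuming the Cheeger rounding bound $\Phi_G(r) \le \sqrt{2\Lambda_G(r)}$ and the comparison $\Lambda_G(4r) \le 2\Lambda'_G(r)$ exactly as quoted above:

```latex
\[
\Phi_G(4r) \;\le\; \sqrt{2\,\Lambda_G(4r)}
          \;\le\; \sqrt{2 \cdot 2\,\Lambda'_G(r)}
          \;=\; 2\sqrt{\Lambda'_G(r)} .
\]
```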

We use this connection to obtain Theorem 1.1; our main theorem is in fact: 

Theorem 1.3. In any strongly connected graph $G$, $\Lambda'_G(4k^{-1+1/A}) \le A \cdot \lambda_k \log_k n$ for any real $A \ge 3$.
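
Theorem 1.1 follows by substituting $r = 4k^{-1+1/A}$ into Theorem 1.2. A short check, assuming the constants are as stated in Theorems 1.2 and 1.3:

```latex
\[
\Phi_G\bigl(16\,k^{-1+1/A}\bigr) = \Phi_G(4r)
  \;\le\; 2\sqrt{\Lambda'_G(r)}
  \;\le\; 2\sqrt{A\,\lambda_k \log_k n}
  \;=\; 2\sqrt{A}\cdot\sqrt{\lambda_k \log_k n} .
\]
```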

This route to bounding the conductance profile is somewhat in contrast to the works [LRTV12, 
LOT12], both of which combine their spectral analysis and "rounding algorithm". 

Indeed, in this work we consider the "analytic" version of the Raghavendra-Steurer [RS10] 
Small-Set Expansion problem: given a graph G = (V, E) with the promise that there is a function 
$f : V \to \mathbb{R}$ which has $\mu[f] \le \delta$ and $\Phi[f] \le \epsilon$, find a function $g : V \to \mathbb{R}$ with $\mu[g] \le O(\delta)$ and
$\Phi[g]$ as small as possible. Following [ABS10], we provide an eigenspace enumeration lemma which,
when combined with Theorem 1.3, yields the following: 



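Theorem 1.4. For any $\alpha \le 1/3$ and $C \ge 1$, there exists an algorithm running in time $\exp(O(n^{\alpha}) \cdot
\frac{1}{\delta}\log(C/\delta))$ with the following guarantee: If there exists $f : V \to \mathbb{R}$ with $\mu[f] \le \delta \le 1/2$ and
$\Phi[f] \le \epsilon \le 1/4$, the algorithm finds $g : V \to \mathbb{R}$ with $\mu[g] \le \delta \cdot (1 + 1/C)$ and $\Phi[g] \le O(\frac{C^2}{\alpha\delta}) \cdot \epsilon$.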



As a byproduct, using Theorem 1.2 we can immediately deduce the following approximation 
algorithm for Small-Set Expansion: 

Corollary 1.5. Fix any small constants $\alpha, \delta > 0$. Then there is an algorithm running in time
$\exp(O(n^{\alpha}))$ with the following guarantee: If there exists $S \subseteq V$ with $\mu[S] \le \delta$ and $\Phi[S] \le \epsilon$, the
algorithm finds $T \subseteq V$ with $\mu[T] \le 5\delta$ and $\Phi[T] \le O(\sqrt{\epsilon})$.

More generally, one can obtain $\Phi[T] \le O(\epsilon^{\beta/2})$ in time $\exp(O(n^{\alpha\epsilon^{1-\beta}}))$ for any $0 \le \beta \le 1$.

This result is incomparable with the Arora-Barak-Steurer Small-Set Expansion algorithm: their 
work had $O(\epsilon^{\beta/3})$ in place of $O(\epsilon^{\beta/2})$ and was analyzed only for regular graphs. On the other hand,
our Corollary 1.5 holds only for $\delta$ a constant, whereas their algorithm works for $\delta$ as small as $n^{-\epsilon^{1-\beta}}$
(which is the more interesting parameter range). 



1 Actually, [GMT06] defined $\Lambda_G(r)$ as the minimization of $\frac{\langle f, Lf\rangle}{\|f\|_2^2 - \|f\|_1^2}$. But their proof of this relationship still goes
through.

1.1.2 Continuous-time random walks

In [OT12], Oveis Gharan and Trevisan prove a lower bound on the probability that a random
walk stays within a set. (Kwok and Lau [KL12] prove a similar but somewhat weaker bound.) 
Specifically, they show: 

Theorem 1.6. Let $G = (V, E)$ be an undirected graph with invariant distribution $\pi$. Let $\emptyset \ne S \subseteq V$
and let $t > 0$ be an integer. Choose $x \sim \pi$ conditioned on $x \in S$, and then perform a $t$-step
discrete-time random walk from $x$. Then the probability that the walk stays entirely within $S$ is at
least $\left(1 - \frac{\Phi[S]}{2}\right)^t$.

We provide a simple proof of a similar theorem using Markov chain methods. 

Theorem 1.7. In the setting of Theorem 1.6, if we instead perform a time-t continuous-time 
random walk, the probability that the walk stays entirely within $S$ is at least $\exp(-t\Phi[S])$.



2 Preliminaries 

Instead of directed graphs, we will use the language of Markov chains; for background, see e.g. [DSC96, 
MT06]. 

Throughout this work, $G$ will denote an irreducible Markov chain on state space $V$ of cardinality
$n$, with no isolated states. We will be considering elements $f$ in the vector space of functions
$V \to \mathbb{R}$. We write $K$ for the adjacency matrix operator: $Kf(x) = \mathbf{E}_{y \sim x}[f(y)]$, where $y \sim x$
denotes that $y$ is obtained by taking one step from $x$ in the chain. $K$ has a unique invariant
probability distribution $\pi$ on $V$ which is nowhere 0. It gives rise to an inner product on functions,
$\langle f, g\rangle = \mathbf{E}_{x \sim \pi}[f(x)g(x)]$. We write $L = \mathrm{id} - K$ for the Laplacian operator and $H_t = \exp(-tL)$ for
the heat kernel (continuous-time transition) operator. 
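
To make these objects concrete, here is a minimal sketch in plain Python. The 3-state transition matrix, the helper names, and the use of a lazy power iteration to approximate $\pi$ are all illustrative assumptions, not part of the paper.

```python
# Toy 3-state chain; K[x][y] is the probability of stepping from x to y.
# The specific matrix is an illustrative assumption, not from the paper.
K = [
    [0.0, 0.5, 0.5],
    [0.25, 0.0, 0.75],
    [0.5, 0.5, 0.0],
]

def apply_K(K, f):
    """(Kf)(x) = E_{y ~ x}[f(y)]."""
    n = len(f)
    return [sum(K[x][y] * f[y] for y in range(n)) for x in range(n)]

def invariant_distribution(K, iters=2000):
    """Approximate pi with pi K = pi by iterating the lazy chain (I + K)/2."""
    n = len(K)
    pi = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(pi[x] * K[x][y] for x in range(n)) for y in range(n)]
        pi = [0.5 * (p + s) for p, s in zip(pi, step)]
    return pi

def inner(pi, f, g):
    """<f, g> = E_{x ~ pi}[f(x) g(x)]."""
    return sum(p * a * b for p, a, b in zip(pi, f, g))

pi = invariant_distribution(K)
f = [1.0, 0.0, 2.0]
Lf = [a - b for a, b in zip(f, apply_K(K, f))]   # L = id - K
print(pi, inner(pi, f, Lf))
```

The lazy iteration is used only to guarantee convergence on periodic chains; it has the same invariant distribution as $K$.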

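Definition 2.1. Given nonzero $f : V \to \mathbb{R}$ we define its analytic boundary size/conductance to be

$$\Phi[f] = \frac{\langle f, Lf\rangle}{\langle f, f\rangle} = 1 - \frac{\langle f, Kf\rangle}{\langle f, f\rangle}.$$

Note that if $f$ is the 0-1 indicator of a set $S \subseteq V$ then $\Phi[f] = \Pr_{x \sim \pi,\, y \sim x}[y \notin S \mid x \in S]$. We will
also write $\Phi[S]$ in this case.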

Definition 2.2. Given a nonzero $f : V \to \mathbb{R}$ we define its analytic sparsity to be

$$\mu[f] = \frac{\|f\|_1^2}{\|f\|_2^2}.$$

Note that if $f$ is the 0-1 indicator of a set $S \subseteq V$ then $\mu[f] = \pi(S)$.
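
As a quick check that both definitions reduce to the familiar set quantities, the following sketch evaluates them exactly (with rational arithmetic) for an indicator function. The 4-vertex path graph and the helper names are illustrative assumptions.

```python
from fractions import Fraction as F

# Simple random walk on the path 0 - 1 - 2 - 3 (an illustrative choice).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
total_degree = sum(len(nbrs) for nbrs in adj.values())
pi = {x: F(len(adj[x]), total_degree) for x in adj}   # invariant distribution

def K(f):
    """(Kf)(x) = average of f over the neighbours of x."""
    return {x: sum(f[y] for y in adj[x]) / len(adj[x]) for x in adj}

def inner(f, g):
    return sum(pi[x] * f[x] * g[x] for x in adj)

def conductance(f):
    """Phi[f] = <f, Lf> / <f, f> with L = id - K."""
    Kf = K(f)
    Lf = {x: f[x] - Kf[x] for x in adj}
    return inner(f, Lf) / inner(f, f)

def sparsity(f):
    """mu[f] = ||f||_1^2 / ||f||_2^2."""
    l1 = sum(pi[x] * abs(f[x]) for x in adj)
    return l1 * l1 / inner(f, f)

S = {0, 1}
ind = {x: F(1 if x in S else 0) for x in adj}
print(conductance(ind), sparsity(ind))   # 1/3 1/2
```

Here one of the three edge endpoints in $S$ leads out of $S$, giving $\Phi[S] = 1/3$, and $\pi(S) = 3/6 = 1/2$.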

These definitions motivate consideration of an "analytic" version of the Small-Set Expansion 
Problem: Assuming there is an analytically sparse $f$ with small analytic boundary, find such an $f$.
More precisely:

Analytic Small-Set Expansion Problem: Given as input $G$ with the promise that there exists
$f : V \to \mathbb{R}$ with $\mu[f] \le \delta \le 1/2$ and $\Phi[f] \le \epsilon$, find $f' : V \to \mathbb{R}$ with $\mu[f'] \le \delta'$ and $\Phi[f'] \le \epsilon'$. In
this bicriteria problem, we typically insist that $\delta' = O(\delta)$ and then try to minimize $\epsilon'$.

Note that the standard Small-Set Expansion problem is the above problem with the additional
restriction that $f$ and $f'$ should be 0-1-valued functions.

For the remainder of this work we will assume that G is reversible. However, this is without loss 
of generality since, given a non-reversible Markov chain G' with adjacency matrix operator K', we 
can replace it with the reversible Markov chain $G$ having adjacency matrix operator $K = \frac{K' + K'^{*}}{2}$,
where $K'^{*}$ denotes the adjoint of $K'$ with respect to $\langle \cdot, \cdot \rangle$.
The chain $G$ has the same invariant distribution $\pi$ as $G'$, which means that the notion of analytic
sparsity is unchanged. Further, if $L$ and $L'$ are the Laplacians of $G$ and $G'$, respectively, then
$\langle f, Lf\rangle = \langle f, L'f\rangle$ for any $f : V \to \mathbb{R}$; hence the notion of analytic boundary is also unchanged.
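
The reduction can be checked on a small example. The sketch below assumes the standard additive reversibilization $K = (K' + K'^{*})/2$, with $K'^{*}(x, y) = \pi(y)K'(y, x)/\pi(x)$; the biased 3-cycle and the helper names are illustrative assumptions.

```python
from fractions import Fraction as F

n = 3
# Biased walk on a 3-cycle: clockwise w.p. 2/3, counterclockwise w.p. 1/3.
Kp = [[F(0)] * n for _ in range(n)]
for x in range(n):
    Kp[x][(x + 1) % n] = F(2, 3)
    Kp[x][(x - 1) % n] = F(1, 3)
pi = [F(1, n)] * n   # uniform, since Kp is doubly stochastic

# Adjoint with respect to <.,.>_pi: K*(x, y) = pi(y) K'(y, x) / pi(x).
Kstar = [[pi[y] * Kp[y][x] / pi[x] for y in range(n)] for x in range(n)]
K = [[(Kp[x][y] + Kstar[x][y]) / 2 for y in range(n)] for x in range(n)]

def quad_form(M, f):
    """<f, (id - M) f> under the pi-inner product."""
    Mf = [sum(M[x][y] * f[y] for y in range(n)) for x in range(n)]
    return sum(pi[x] * f[x] * (f[x] - Mf[x]) for x in range(n))

f = [F(1), F(0), F(-2)]
reversible = all(pi[x] * K[x][y] == pi[y] * K[y][x]
                 for x in range(n) for y in range(n))
print(reversible, quad_form(K, f) == quad_form(Kp, f))   # True True
```

The symmetrized chain satisfies detailed balance while leaving the quadratic form, and hence $\Phi[f]$, unchanged.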

Given a reversible chain $G$, the operators $K$, $L$, and $H_t$ have a common orthogonal basis of
eigenfunctions. We will write $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ for the eigenvalues of $L$; note that the $i$th
eigenvalue of $K$ is $1 - \lambda_i$ and the $i$th eigenvalue of $H_t$ is $\exp(-t\lambda_i)$. All of our theorems which
mention the eigenvalues $\lambda_i$ hold also for non-reversible chains $G'$, with the $\lambda_i$'s being those for the
associated reversible chain $G$.

Following [ABS10], our algorithm for the Analytic Small-Set Expansion problem (Theorem 1.4) 
breaks into two cases, depending on the "analytic nullity" of L (called "threshold rank" in [ABS10]): 

Definition 2.3. We define $\mathrm{nullity}_\eta(L) = \#\{i : \lambda_i \le \eta\}$. Note that $\mathrm{nullity}_0(L)$ is the usual nullity.
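
In code the analytic nullity is a simple count. This sketch assumes the eigenvalues of $L$ are already available as a list and that the threshold is inclusive; the function name is hypothetical.

```python
def analytic_nullity(eigenvalues, eta):
    """Number of eigenvalues of L that are at most eta (threshold inclusive)."""
    return sum(1 for lam in eigenvalues if lam <= eta)

# Eigenvalues of L = id - K for the simple random walk on a 4-vertex path:
# K has eigenvalues cos(j * pi / 3) for j = 0..3, i.e. 1, 1/2, -1/2, -1.
path_eigenvalues = [0.0, 0.5, 1.5, 2.0]
print(analytic_nullity(path_eigenvalues, 0.0),
      analytic_nullity(path_eigenvalues, 0.5))   # 1 2
```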

Remark 2.4. Throughout we will present algorithms in the model of exact arithmetic. E.g., we 
will assume that given G, the eigenvalues and eigenfunctions of L can be computed exactly. We 
believe (but have not verified) that our results can be extended to standard computational models 
(e.g., Turing machines). 

3 A new bound on the spectral profile 

Here we give our new spectral criterion, based on the trace of the heat kernel, which ensures the 
existence of an analytically sparse function with small analytic boundary. 

Theorem 3.1. Fix $0 < \gamma < 1 < A$ and suppose there exists $t > 0$ such that

$$\mathrm{tr}(H_t) - \tfrac{1}{\gamma}\mathrm{tr}(LH_t) \ge A. \tag{4}$$

Then in poly($n$) time one can find $g : V \to \mathbb{R}^{\ge 0}$ satisfying $\mu[g] \le 1/A$ and $\Phi[g] \le \gamma$.

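Proof. Let $\phi_x = \frac{1}{\pi(x)} \cdot 1_x$ for $x \in V$, so $\mathbf{E}[\phi_x] = 1$. Write $\phi'_x = \sqrt{\pi(x)} \cdot \phi_x$, so the collection $(\phi'_x)_{x \in V}$
forms an orthonormal basis. Since trace is "the sum of the diagonal entries", we have

$$\mathrm{tr}(H_t) = \sum_{x} \langle \phi'_x, H_t \phi'_x\rangle = \sum_{x} \pi(x)\langle \phi_x, H_t \phi_x\rangle = \mathbf{E}_{x \sim \pi}[\langle H_{t/2}\phi_x, H_{t/2}\phi_x\rangle].$$

Similarly, $\mathrm{tr}(LH_t) = \mathbf{E}_{x \sim \pi}[\langle H_{t/2}\phi_x, LH_{t/2}\phi_x\rangle]$. Thus the assumption (4) implies

$$\mathbf{E}_{x \sim \pi}\left[\langle H_{t/2}\phi_x, H_{t/2}\phi_x\rangle - \tfrac{1}{\gamma}\langle H_{t/2}\phi_x, LH_{t/2}\phi_x\rangle\right] \ge A.$$

Select (in poly($n$) time) a particular $x_0 \in V$ achieving at least $A$ in this expectation. We define
$g = H_{t/2}\phi_{x_0}$ and therefore we have

$$\langle g, g\rangle - \tfrac{1}{\gamma}\langle g, Lg\rangle \ge A. \tag{5}$$

Note that $g \ge 0$ since $\phi_{x_0} \ge 0$ and $H_{t/2}$ is positivity-preserving. Thus $\|g\|_1 = \mathbf{E}[g] = \mathbf{E}[\phi_{x_0}] = 1$.
Further, from (5) and $\langle g, Lg\rangle \ge 0$ we deduce $\langle g, g\rangle \ge A$; thus $\mu[g] \le 1/A$ as desired. Finally, (5) certainly implies
$\langle g, g\rangle - \frac{1}{\gamma}\langle g, Lg\rangle \ge 0$, which is equivalent to $\Phi[g] \le \gamma$. □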
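
The construction in this proof is easy to try numerically. The following is a minimal sketch in plain Python; the 4-vertex path chain, the truncated power series used for $H_t$, the parameter values, and all helper names are illustrative assumptions rather than the paper's algorithm as implemented anywhere.

```python
import math

# Simple random walk on the path 0 - 1 - 2 - 3: a reversible toy chain
# chosen for illustration only.
K = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
pi = [1 / 6, 2 / 6, 2 / 6, 1 / 6]
n = len(K)

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, f):
    return [sum(M[x][y] * f[y] for y in range(n)) for x in range(n)]

def inner(f, g):
    return sum(p * a * b for p, a, b in zip(pi, f, g))

def heat_kernel(t, terms=60):
    """H_t = exp(-t(I - K)) = e^{-t} * sum_j (t^j / j!) K^j, truncated."""
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    H = [[0.0] * n for _ in range(n)]
    coeff = math.exp(-t)
    for j in range(terms):
        H = [[H[a][b] + coeff * power[a][b] for b in range(n)] for a in range(n)]
        power = mat_mul(power, K)
        coeff *= t / (j + 1)
    return H

def candidate(x0, t):
    """g = H_{t/2} phi_{x0}, where phi_{x0} = 1_{x0} / pi(x0)."""
    phi = [1 / pi[x0] if x == x0 else 0.0 for x in range(n)]
    return apply(heat_kernel(t / 2), phi)

def score(g, gamma):
    """<g, g> - (1/gamma) <g, Lg>, the quantity averaged in the proof."""
    Lg = [a - b for a, b in zip(g, apply(K, g))]
    return inner(g, g) - inner(g, Lg) / gamma

t, gamma = 1.0, 0.9
g = max((candidate(x, t) for x in range(n)), key=lambda h: score(h, gamma))
l1 = sum(p * abs(v) for p, v in zip(pi, g))
mu = l1 ** 2 / inner(g, g)
Lg = [a - b for a, b in zip(g, apply(K, g))]
print(l1, mu, inner(g, Lg) / inner(g, g))   # ||g||_1, mu[g], Phi[g]
```

Averaging `score(candidate(x, t), gamma)` over $x \sim \pi$ reproduces $\mathrm{tr}(H_t) - \frac{1}{\gamma}\mathrm{tr}(LH_t)$, which is the identity the proof relies on.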

A straightforward calculation now shows that if L has large analytic nullity then we can get 
good bounds from Theorem 3.1: 

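Corollary 3.2. Fix $0 < \gamma < 1$. Let $0 < \alpha \le 1/3$ and let $k = \mathrm{nullity}_{\alpha\gamma}(L)$. Assume $k \ge$ […]. Then
in poly($n$) time one can find $g : V \to \mathbb{R}^{\ge 0}$ satisfying $\Phi[g] \le \gamma$ and $\mu[g] \le 1/A$, where $A = \frac{k}{4n^{\alpha}}$.

Proof. We show that (4) from Theorem 3.1 holds with $\gamma$, $A$, and $t = \frac{1}{\gamma}\ln n$. We have

$$\mathrm{tr}(H_t) - \tfrac{1}{\gamma}\mathrm{tr}(LH_t) = \sum_{i=1}^{n} \left(1 - \tfrac{\lambda_i}{\gamma}\right)\exp(-t\lambda_i) = \sum_{i=1}^{n} \left(1 - \tfrac{\lambda_i}{\gamma}\right) n^{-\lambda_i/\gamma}. \tag{6}$$

The expression $(1 - r)n^{-r}$ is decreasing for $r \in [0, 1]$; for larger $r$, it attains its minimum at
$r = 1 + \frac{1}{\ln n}$, where it has value $-\frac{1}{e n \ln n}$. Thus by distinguishing whether $r = \frac{\lambda_i}{\gamma} \le \alpha$ in (6) we may obtain

$$(6) \ge \#\{i : \lambda_i \le \alpha\gamma\} \cdot (1 - \alpha)n^{-\alpha} - \#\{i : \lambda_i > \alpha\gamma\} \cdot \frac{1}{e n \ln n} \ge \frac{k}{n^{\alpha}}(1 - \alpha) - \frac{1}{e \ln n}.$$

Using $\alpha \le 1/3$ and $k \ge$ […], the above is indeed at least $A = \frac{k}{4n^{\alpha}}$. □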
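
The calculus claim in this proof can be sanity-checked numerically. This sketch assumes the expression in question is $h(r) = (1 - r)n^{-r}$, with claimed minimizer $r^* = 1 + 1/\ln n$ and minimum value $-1/(e\,n \ln n)$; the value of `n` and the grid are arbitrary choices.

```python
import math

def h(r, n):
    """The expression (1 - r) * n^(-r) from the proof."""
    return (1 - r) * n ** (-r)

n = 1000
r_star = 1 + 1 / math.log(n)
claimed_min = -1 / (math.e * n * math.log(n))

# Compare against a fine grid on [1, 6] and check monotonicity on [0, 1].
grid_min = min(h(1 + j / 10000, n) for j in range(50001))
decreasing = all(h(j / 1000, n) > h((j + 1) / 1000, n) for j in range(1000))
print(h(r_star, n), claimed_min, grid_min, decreasing)
```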

Restating the parameters yields: 

Corollary 3.3. Let $0 < \delta < 1$. If there exists $\alpha \le 1/3$ such that $\mathrm{nullity}_{\alpha\gamma}(L) \ge \frac{4}{\delta}n^{\alpha}$, then in poly($n$)
time one can find $g : V \to \mathbb{R}^{\ge 0}$ satisfying $\mu[g] \le \delta$ and $\Phi[g] \le \gamma$.

An alternative restatement of the parameters yields our main Theorem 1.3: simply take $\alpha =
\frac{1}{A \log_k n}$ and $\gamma = A\lambda_k \log_k n$ in Corollary 3.2.
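
As a sanity check on this substitution, here is a short sketch. The choices $\alpha = 1/(A \log_k n)$ and $\gamma = A\lambda_k \log_k n$ are read off from the statement of Theorem 1.3 and should be treated as an assumption where the source is unclear:

```latex
\[
\alpha\gamma = \lambda_k, \qquad
n^{\alpha} = n^{\frac{\log_n k}{A}} = k^{1/A}, \qquad
\alpha = \frac{\log_n k}{A} \le \frac{1}{A} \le \frac{1}{3}
\quad \text{when } A \ge 3 .
\]
```

So at least $k$ eigenvalues of $L$ are at most $\alpha\gamma$, and any sparsity bound of the form $c\,n^{\alpha}/k$ becomes $c\,k^{-1+1/A}$.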

4 An algorithm for Analytic Small-Set Expansion 

In [ABS10] it is shown that when L has small analytic nullity, one can find sparse sets by brute- 
force search through low-eigenvalue eigenspace. We present a very similar algorithm for finding 
analytically sparse sets. 

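Lemma 4.1. Suppose there exists $f : V \to \mathbb{R}$ with

$$\mu[f] \le \delta \le 1/2, \qquad \Phi[f] \le \epsilon \le 1/4.$$

Let $2\epsilon \le \eta \le 1$. Then in time $\exp(O(\mathrm{nullity}_\eta(L) \log(\eta/\epsilon))) \cdot \mathrm{poly}(n)$ one can find $g : V \to \mathbb{R}$
satisfying

$$\mu[g] \le \delta + O(\epsilon/\eta + \sqrt{\delta\epsilon/\eta}) \le O(\delta + \epsilon/\eta), \qquad \Phi[g] \le \eta.$$

Remark 4.2. It is also quite easy to show $g$ will satisfy $\Phi[g] \le O(\sqrt{\epsilon/\eta})$, which is useful if $\eta \gg \epsilon^{1/3}$.
We will not need this parameter setting, so we omit the proof.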

Proof. Let $\psi_1, \dots, \psi_n$ be an orthonormal basis of eigenfunctions for $L$, corresponding to eigenvalues
$\lambda_1, \dots, \lambda_n$. Without loss of generality, assume $\|f\|_2 = 1$. Write $m = \mathrm{nullity}_\eta(L)$ and write $U$ for
the dimension-$m$ subspace spanned by $\psi_1, \dots, \psi_m$. Express $f = \sum_{i=1}^{n} c_i\psi_i$, so $\sum_i c_i^2 = 1$ by the
orthonormality of the $\psi_i$'s. We have

$$\epsilon \ge \Phi[f] = \langle f, Lf\rangle = \sum_{i=1}^{n} c_i^2\lambda_i \ge \sum_{i>m} c_i^2\lambda_i \ge \eta\sum_{i>m} c_i^2.$$

In other words, if $f_U$ denotes $\sum_{i \le m} c_i\psi_i$ then $\|f - f_U\|_2^2 \le \epsilon/\eta$ (which is at most $1/2$ by the
assumption on $\eta$). If we define $u \in U$ to be the unit vector $f_U/\|f_U\|_2$, it follows that

$$\|f - u\|_2 \le \sqrt{2\epsilon/\eta}.$$

As in [ABS10] we can now consider all $g$ in a $.5\sqrt{\epsilon/\eta}$-net for the unit sphere of $U$. The cardinality
of this net is $\exp(O(m\log(\eta/\epsilon)))$. One such $g$ will satisfy

$$\|u - g\|_2 \le .5\sqrt{\epsilon/\eta} \quad \text{and hence} \quad \|f - g\|_2 \le 2\sqrt{\epsilon/\eta}.$$

For this $g$ we have

$$\|g\|_1 \le \|f\|_1 + \|f - g\|_1 \le \sqrt{\mu[f]} + \|f - g\|_2 \le \sqrt{\delta} + 2\sqrt{\epsilon/\eta}$$

and hence $\mu[g] \le \delta + O(\epsilon/\eta + \sqrt{\delta\epsilon/\eta})$, as desired. Since $g$ is a unit vector in $U$ we may also
immediately conclude $\Phi[g] \le \eta$. □
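
The projection step of this proof can be illustrated in eigenbasis coordinates, where functions are just coefficient vectors. The sketch below uses made-up eigenvalues and coefficients (hypothetical data, not from the paper) and checks the two bounds $\|f - u\|_2 \le \sqrt{2\epsilon/\eta}$ and $\Phi[u] \le \eta$; it does not implement the net enumeration.

```python
import math

def project_to_low_eigenspace(coeffs, eigenvalues, eta):
    """Zero out coefficients with eigenvalue > eta, then renormalize."""
    low = [c if lam <= eta else 0.0 for c, lam in zip(coeffs, eigenvalues)]
    norm = math.sqrt(sum(c * c for c in low))
    return [c / norm for c in low]

# Hypothetical data: eigenvalues of L and a unit vector f in the eigenbasis.
eigenvalues = [0.0, 0.05, 0.1, 0.8, 1.2, 1.5]
raw = [0.7, 0.6, 0.35, 0.1, 0.05, 0.02]
scale = math.sqrt(sum(c * c for c in raw))
f = [c / scale for c in raw]

# With ||f||_2 = 1 we have Phi[f] = sum_i c_i^2 * lambda_i.
eps = sum(c * c * lam for c, lam in zip(f, eigenvalues))
eta = 0.5                       # satisfies 2 * eps <= eta <= 1 for this data
u = project_to_low_eigenspace(f, eigenvalues, eta)
dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(f, u)))
phi_u = sum(c * c * lam for c, lam in zip(u, eigenvalues))
print(dist <= math.sqrt(2 * eps / eta), phi_u <= eta)   # True True
```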

From Corollary 3.3 we know that if L has large analytic nullity then there is automatically an 
(easily findable) $f : V \to \mathbb{R}$ which is analytically sparse and has small analytic boundary. On
the other hand, if L has small analytic nullity, the above lemma can solve the Analytic Small-Set 
Expansion problem in not too much time. Combining these facts lets us prove our Theorem 1.4, 
restated here for convenience: 

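Theorem 1.4. For any $\alpha \le 1/3$ and $C \ge 1$, there exists an algorithm running in time $\exp(O(n^{\alpha}) \cdot
\frac{1}{\delta}\log(C/\delta))$ with the following guarantee: If there exists $f : V \to \mathbb{R}$ with $\mu[f] \le \delta \le 1/2$ and
$\Phi[f] \le \epsilon \le 1/4$, the algorithm finds $g : V \to \mathbb{R}$ with $\mu[g] \le \delta \cdot (1 + 1/C)$ and $\Phi[g] \le O(\frac{C^2}{\alpha\delta}) \cdot \epsilon$.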

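Proof. Set $\gamma = \frac{B}{\alpha\delta} \cdot \epsilon$; we will eventually take $B = O(C^2)$. If $\mathrm{nullity}_{\alpha\gamma}(L) \ge \frac{4}{\delta}n^{\alpha}$ then from
Corollary 3.3 we can find $g$ with $\mu[g] \le \delta$, $\Phi[g] \le \gamma$ in poly($n$) time; in fact, here we don't even need
to assume the existence of $f$. Otherwise, Lemma 4.1 (applied with $\eta = \alpha\gamma = B\epsilon/\delta$) tells us that in time $\exp(O(n^{\alpha}) \cdot \frac{1}{\delta}\log(B/\delta))$
we can find a $g$ satisfying

$$\mu[g] \le \delta + O\left(\frac{\delta}{B} + \frac{\delta}{\sqrt{B}}\right) = \delta \cdot (1 + O(1/\sqrt{B})), \qquad \Phi[g] \le \alpha\gamma \le \gamma.$$

Thus the result follows by taking $B = O(C^2)$. □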

5 The probability a random walk stays entirely within a set 

In [OT12] the authors show that a $t$-step discrete time random walk starting from a random vertex
in $S \subseteq V$ stays entirely within $S$ with probability at least $\left(1 - \frac{\Phi[S]}{2}\right)^t$. We give a proof of a similar
result for continuous-time random walks using Markov chain methods.

Theorem 1.7 restated. For any $\emptyset \ne S \subseteq V$ and real $t > 0$, let $C(t, S)$ denote the probability
that a continuous-time-$t$ random walk, started from a random $x \sim S$, stays entirely within $S$. Then
$C(t, S) \ge \exp(-t\Phi[S])$.

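Proof. Let us define the operator $K_S$ as follows: Given $f : V \to \mathbb{R}$, $K_S f(x) = \mathbf{E}_{y \sim x}[f(y)1_S(y)]$,
where $1_S$ is the indicator function for $S$. We then define $L_S = I - K_S$ and $H_{t,S} = \exp(-tL_S)$.
Also, let $v_1, \dots, v_n$ be the eigenvectors of $L_S$ (which are also eigenvectors of $K_S$ and $H_{t,S}$) and let
$\lambda_1, \dots, \lambda_n$ be the corresponding eigenvalues of $L_S$.

Define $\phi'_S = \frac{1}{\sqrt{\pi(S)}} \cdot 1_S$. We can then write $\phi'_S = \sum_i c_i v_i$ for some constants $c_i$. Since $\|\phi'_S\|_2 = 1$
it follows that $\sum_i c_i^2 = 1$.

First, we will show that $\Phi[S] = \sum_i c_i^2\lambda_i$.

$$\begin{aligned}
\Phi[S] &= \Pr_{x \sim \pi,\, y \sim x}[y \notin S \mid x \in S] \\
&= \Pr_{x \sim \pi,\, y \sim x}[y \notin S \wedge x \in S] \,/\, \Pr_{x \sim \pi}[x \in S] \\
&= \tfrac{1}{\pi(S)} \mathbf{E}_{x \sim \pi}\left[1_S(x)\left(1_S(x) - \mathbf{E}_{y \sim x}[1_S(y)]\right)\right] \\
&= \tfrac{1}{\pi(S)} \mathbf{E}_{x \sim \pi}\left[1_S(x)\left(I1_S(x) - K_S 1_S(x)\right)\right] \\
&= \langle \phi'_S, L_S\phi'_S\rangle = \sum_i c_i^2\lambda_i.
\end{aligned}$$

Now we show that $C(t, S) = \sum_i c_i^2\exp(-t\lambda_i)$. Let $w_0, \dots, w_\tau$ be the states of a time-$t$
continuous-time random walk in $G$; note that this is the same as a $\tau$-step discrete-time random
walk, where $\tau \sim \mathrm{Poisson}(t)$. Let $W$ denote the set of all states visited. Then:

$$\begin{aligned}
C(t, S) &= \Pr[W \subseteq S \mid w_0 \in S] \\
&= \tfrac{1}{\pi(S)} \mathbf{E}[1_S(w_0)1_S(w_1) \cdots 1_S(w_\tau)] \\
&= \tfrac{1}{\pi(S)} \mathbf{E}_{x \sim \pi}\, \mathbf{E}_{\tau \sim \mathrm{Poisson}(t)}\left[1_S(x) K_S^{\tau} 1_S(x)\right] \\
&= \langle \phi'_S, H_{t,S}\phi'_S\rangle \\
&= \sum_i c_i^2\exp(-t\lambda_i).
\end{aligned}$$

To complete the proof, we need to show that $\sum_i c_i^2\exp(-t\lambda_i) \ge \exp(-t\sum_i c_i^2\lambda_i)$. This follows
immediately by the convexity of the exponential function and Jensen's inequality. □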
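
The bound can be checked on a small example by summing the Poisson mixture directly. This is a sketch under illustrative assumptions: the chain is the simple random walk on the path 0 - 1 - 2 - 3, $S = \{0, 1\}$ (so $\Phi[S] = 1/3$), and the series is truncated after a fixed number of terms.

```python
import math

# Walk on the path 0 - 1 - 2 - 3 restricted to S = {0, 1} (illustrative).
pi_S = [1 / 6, 2 / 6]            # pi(0), pi(1)
K_S = [[0.0, 1.0], [0.5, 0.0]]   # transitions that leave S are dropped
phi_S = 1 / 3                    # Phi[S] = Pr[y not in S | x in S]

def stay_probability(t, terms=80):
    """C(t, S) = E_{tau ~ Poisson(t)} Pr[the first tau steps stay in S]."""
    alive = [1.0, 1.0]            # K_S^j 1_S restricted to S, starting at j = 0
    weight = math.exp(-t)         # Poisson(t) mass at j = 0
    total = 0.0
    for j in range(terms):
        stay_j = sum(p * a for p, a in zip(pi_S, alive)) / sum(pi_S)
        total += weight * stay_j
        alive = [sum(K_S[x][y] * alive[y] for y in range(2)) for x in range(2)]
        weight *= t / (j + 1)
    return total

for t in (0.5, 1.0, 3.0):
    print(t, stay_probability(t), math.exp(-t * phi_S))
```

In each row the second number (the exact stay probability, up to truncation) should be at least the third (the bound $\exp(-t\Phi[S])$).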